Abstract

Reputation mechanisms offer an effective alternative to verification authorities for build- 
ing trust in electronic markets with moral hazard. Future clients guide their business de- 
cisions by considering the feedback from past transactions; if truthfully exposed, cheating 
behavior is sanctioned and thus becomes irrational. 

It therefore becomes important to ensure that rational clients have the right incentives 
to report honestly. As an alternative to side-payment schemes that explicitly reward truth- 
ful reports, we show that honesty can emerge as a rational behavior when clients have a 
repeated presence in the market. To this end we describe a mechanism that supports an 
equilibrium where truthful feedback is obtained. Then we characterize the set of pareto- 
optimal equilibria of the mechanism, and derive an upper bound on the percentage of false 
reports that can be recorded by the mechanism. An important role in the existence of this
bound is played by the fact that rational clients can establish a reputation for reporting
honestly.


1. Introduction 

The availability of ubiquitous communication through the Internet is driving the migra- 
tion of business transactions from direct contact between people to electronically mediated 
interactions. People interact electronically either through human-computer interfaces or 
through programs representing humans, so-called agents. In either case, no physical in- 
teractions among entities occur, and the systems are much more susceptible to fraud and 
deception. 

Traditional methods to avoid cheating involve cryptographic schemes and trusted third
parties (TTPs) that oversee every transaction. Such systems are very costly, introduce
potential bottlenecks, and may be difficult to deploy due to the complexity and heterogene- 
ity of the environment: e.g., agents in different geographical locations may be subject to 
different legislation, or different interaction protocols. 

Reputation mechanisms offer a novel and effective way of ensuring the necessary level 
of trust which is essential to the functioning of any market. They are based on the observa- 
tion that agent strategies change when we consider that interactions are repeated: the other 
party remembers past cheating, and changes its terms of business accordingly in the future. 
Therefore, the expected gains due to future transactions in which the agent has a higher 
reputation can offset the loss incurred by not cheating in the present. This effect can be
amplified considerably when such reputation information is shared among a large population,
and thus multiplies the expected future gains made accessible by honest behavior. 

Existing reputation mechanisms enjoy huge success. Systems such as eBay 1 or Amazon 2 
implement reputation mechanisms which are partly credited for the businesses' success. 
Studies show that human users seriously take into account the reputation of the seller when 
placing bids in online auctions (Houser & Wooders, 2006), and that despite the incentive 
to free ride, feedback is provided in more than half of the transactions on eBay (Resnick & 
Zeckhauser, 2002). 

One important challenge associated with designing reputation mechanisms is to ensure 
that truthful feedback is obtained about the actual interactions, a property called incentive- 
compatibility. Rational users can regard the private information they have observed as a 
valuable asset, not to be freely shared. Even worse, agents can have external incentives
to misreport and thus manipulate the reputation information available to other agents 
(Harmon, 2004). Without proper measures, the reputation mechanism will obtain unreliable 
information, biased by the strategic interests of the reporters. 

Honest reporting incentives should be addressed differently depending on the predomi- 
nant role of the reputation mechanisms. The signaling role is useful in environments where 
the service offered by different providers may have different quality, but all clients inter- 
acting with the same provider are treated equally (markets with adverse selection). This 
is the case, for example, in a market of web-services. Different providers possess different 
hardware resources and employ different algorithms; this makes certain web-services better 
than others. Nevertheless, all requests issued to the same web-service are treated by the 
same program. Some clients might experience worse service than others, but these differ- 
ences are random, and not determined by the provider. The feedback from previous clients 
statistically estimates the quality delivered by a provider in the future, and hence signals 
to future clients which provider should be selected. 

The sanctioning role, on the other hand, is present in settings where service requests 
issued by clients must be individually addressed by the provider. Think of a barber, who 
must skillfully shave every client that walks in his shop. The problem here is that providers 
must exert care (and costly effort) for satisfying every service request. Good quality can 
result only when enough effort was exerted, but the provider is better off by exerting less 
effort: e.g., clients will anyway pay for the shave, so the barber is better off by doing a sloppy 
job as fast as possible in order to have time for more customers. This moral hazard situation 
can be eliminated by a reputation mechanism that punishes providers for not exerting effort. 
Low effort results in negative feedback that decreases the reputation, and hence the future 
business opportunities of the provider. The future loss due to a bad reputation offsets the 
momentary gain obtained by cheating, and makes cooperative behavior profitable. 

There are well known solutions for providing honest reporting incentives for signaling 
reputation mechanisms. Since all clients interacting with a service receive the same quality 
(in a statistical sense), a client's private observation influences her belief regarding the 
experience of other clients. In the web-services market mentioned before, the fact that one 
client had a bad experience with a certain web-service makes her more likely to believe 
that other clients will also encounter problems with that same web-service. This correlation
between the client's private belief and the feedback reported by other clients can be used
to design feedback payments that make honesty a Nash equilibrium. When submitting
feedback, clients get paid an amount that depends both on the value they reported
and on the reports submitted by other clients. As long as others report truthfully, the
expected payment of every client is maximized by the honest report, hence the equilibrium.
Miller, Resnick, and Zeckhauser (2005) and Jurca and Faltings (2006) show that incentive-
compatible payments can be designed to offset both reporting costs and lying incentives.

1. www.ebay.com

2. www.amazon.com

For sanctioning reputation mechanisms the same payment schemes are not guaranteed to 
be incentive-compatible. Different clients may experience different service quality because 
the provider decided to exert different effort levels. The private beliefs of the reporter 
may no longer be correlated to the feedback of other clients, and therefore, the statistical 
properties exploited by Miller et al. (2005) are no longer present. 

As an alternative, we propose different incentives to motivate honest reporting based 
on the repeated presence of the client in the market. Game theoretic results (i.e., the folk 
theorems) show that repeated interactions support new equilibria where present deviations 
are made unattractive by future penalties. Even without a reputation mechanism, a client 
can guide her future play depending on the experience of previous interactions. As a first 
result of this paper, we describe a mechanism that indeed supports a cooperative equilibrium 
where providers exert effort all the time. The reputation mechanism correctly records when 
the client received low quality. 

There are certainly some applications where clients repeatedly interact with the same 
seller with a potential moral hazard problem. The barber shop mentioned above is one 
example, as most people prefer going to the same barber (or hairdresser). Another example 
is a market of delivery services. Every package must be scheduled for timely delivery, 
and this involves a cost for the provider. Some of this cost may be saved by occasionally 
dropping a package, hence the moral hazard. Moreover, business clients typically rely 
on the same carrier to dispatch their documents or merchandise. As their own business 
depends on the quality and timeliness of the delivery, they do have the incentive to form a 
lasting relationship and get good service. Yet another example is that of a business person 
who repeatedly travels to an offshore client. The business person has a direct interest to 
repeatedly obtain good service from the hotel which is closest to the client's offices. 

We assume that the quality observed by the clients is also influenced by environmental 
factors outside the control of, yet observable by, the provider. Despite the barber's
best effort, a sudden movement of the client can always generate an accidental cut that will 
make the client unhappy. Likewise, the delivery company may occasionally lose or damage 
some packages due to transportation accidents. Nevertheless, the delivery company (like 
the barber) eventually learns with certainty about any delays, damages or losses that entitle 
clients to complain about unsatisfactory service. 

The mechanism we propose is quite simple. Before asking the client for feedback, the
mechanism gives the provider the opportunity to acknowledge failure, and reimburse the 
client. Only when the provider claims good service does the reputation mechanism record 
the feedback of the client. Contradictory reports (the provider claims good service, but the 
client submits negative feedback) may only appear when one of the parties is lying, and 
therefore, both the client and the provider are sanctioned: the provider suffers a loss as a 
consequence of the negative report, while the client is given a small fine. 

In one equilibrium of the mechanism, providers always do their best to deliver
the promised quality, and truthfully acknowledge the failures caused by the environmental
factors. Their "honest" behavior is motivated by the threat that any mistake will drive
the unsatisfied client away from the market. When future transactions generate sufficient
revenue, the provider cannot afford to risk losing a client, hence the equilibrium.

Unfortunately, this socially desired equilibrium is not unique. Clients can occasionally 
accept bad service and keep returning to the same provider because they don't have better 
alternatives. Moreover, since complaining about bad service is sanctioned by the reputation
mechanism, clients might be reluctant to report negative feedback. Penalties for negative
reports and the clients' lack of choice drive the provider to occasionally cheat in order to
increase his revenue.

As a second result, we characterize the set of pareto-optimal equilibria of our mechanism 
and prove that the amount of unreported cheating that can occur is limited by two factors. 
The first factor limits the amount of cheating in general, and is given by the quality of 
the alternatives available to the clients. Better alternatives increase the expectations of the 
clients, therefore the provider must cheat less in order to keep his customers. 

The second factor limits the amount of unreported cheating, and represents the cost 
incurred by clients to establish a reputation for reporting the truth. By stubbornly exposing 
bad service when it happens, despite the fine imposed by the reputation mechanism, the 
client signals to the provider that she is committed to always report the truth. Such signals
will eventually lead the provider to switch to full cooperation in order to avoid the
punishment for negative feedback. Having a reputation for reporting truthfully is, of course,
valuable to the client; therefore, a rational client accepts to lie (and give up the reputation)
only when the cost of building a reputation for reporting honestly is greater than the 
occasional loss created by tolerated cheating. This cost is given by the ease with which 
the provider switches to cooperative play, and by the magnitude of the fine imposed for 
negative feedback. 

Concretely, this paper proceeds as follows. In Section 2 we describe related work, fol- 
lowed by a more detailed description of our setting in Section 3. Section 4 presents a game 
theoretic model of our mechanism and an analysis of reporting incentives and equilibria. 
Here we establish the existence of the cooperative equilibrium, and derive an upper bound
on the amount of cheating that can occur in any pareto-optimal equilibrium. 

In Section 5 we establish the cost of building a reputation for reporting honestly, and 
hence compute an upper bound on the percentage of false reports recorded by the reputation 
mechanism in any equilibrium. 

We continue in Section 6 by analyzing the impact of malicious buyers that explicitly 
try to destroy the reputation of the provider. We give some initial approximations on the 
worst case damage such buyers can cause to providers. Further discussion, open issues and
directions for future work are presented in Section 7. Finally, Section 8 concludes our work.

2. Related Work 

The notion of reputation is often used in Game Theory to signal the commitment of a player 
towards a fixed strategy. This is what we mean by saying that clients establish a reputation 
for reporting the truth: they commit to always report the truth. Building a reputation 
usually requires a repeated game with incomplete information, and can significantly impact
the set of equilibrium points of the game. This is commonly referred to as the reputation 
effect, first characterized by the seminal papers of Kreps, Milgrom, Roberts, and Wilson 
(1982), Kreps and Wilson (1982) and Milgrom and Roberts (1982). 

The reputation effect can be extended to all games where a player (A) could benefit
from committing to a certain strategy $\sigma$ that is not credible in a complete information
game: e.g., a monopolist seller would like to commit to fight all potential entrants in a
chain-store game (Selten, 1978); however, this commitment is not credible due to the cost
of fighting. In an incomplete information game where the commitment type has positive
probability, A's opponent (B) can at some point become convinced that A is playing as if
she were the commitment type. At that point, B will play a best response against $\sigma$, which
gives A the desired payoff. Establishing a reputation for the commitment strategy requires 
time and cost. When the higher future payoffs offset the cost of building reputation, the 
reputation effect prescribes minimum payoffs any equilibrium strategy should give to player 
A (otherwise, A can profitably deviate by playing as if she were a commitment type). 

Fudenberg and Levine (1989) study the class of all repeated games in which a long-run 
player faces a sequence of single-shot opponents who can observe all previous games. If the 
long-run player is sufficiently patient and the single-shot players have a positive prior belief 
that the long-run player might be a commitment type, the authors derive a lower bound on 
the payoff received by the long-run player in any Nash equilibrium of the repeated game. 
This result holds for both finitely and infinitely repeated games, and is robust against further 
perturbations of the information structure (i.e., it is independent of what other types have 
positive probability). 

Schmidt (1993) provides a generalization of the above result for the two long-run player 
case in a special class of games called games of "conflicting interests", when one of the players is
sufficiently more patient than the opponent. A game is of conflicting interests when the 
commitment strategy of one player (A) holds the opponent (B) to his minimax payoff. The 
author derives an upper limit on the number of rounds B will not play a best response to 
A's commitment type, which in turn generates a lower bound on A's equilibrium payoff. For 
a detailed treatment of the reputation effect, the reader is directed to the work of Mailath 
and Samuelson (2006). 

In computer science and information systems research, reputation information defines 
some aggregate of feedback reports about past transactions. This is the semantics we are 
using when referring to the reputation of the provider. Reputation information encompasses 
a unitary appreciation of the personal attributes of the provider, and influences the trusting 
decisions of clients. Depending on the environment, reputation has two main roles: to signal 
the capabilities of the provider, and to sanction cheating behavior (Kuwabara, 2003). 

Signaling reputation mechanisms allow clients to learn which providers are the most 
capable of providing good service. Such systems have been widely used in computational 
trust mechanisms. Birk (2001) and Biswas, Sen, and Debnath (2000) describe systems 
where agents use their direct past experience to recognize trustworthy partners. The global 
efficiency of the market is clearly increased; however, the time needed to build the reputation
information prohibits the use of this kind of mechanism in a large scale online market.

A number of signaling reputation mechanisms also take into consideration indirect rep- 
utation information, i.e., information reported by peers. Schillo, Funk, and Rovatsos (2000) 
and Yu and Singh (2002, 2003) use social networks in order to obtain the reputation of an
unknown agent. Agents ask acquaintances several hops away about the trustworthiness of 
an unknown agent. Recommendations are afterwards aggregated into a single measure of 
the agent's reputation. This class of mechanisms, however intuitive, does not provide any 
rational participation incentives for the agents. Moreover, there is little protection against 
untruthful reporting, and no guarantee that the mechanism cannot be manipulated by a 
malicious provider in order to obtain higher payoffs. 

Truthful reporting incentives for signaling reputation mechanisms are described by 
Miller et al. (2005). Honest reports are explicitly rewarded by payments that take into 
account the value of the submitted report, and the value of a report submitted by another 
client (called the reference reporter). The payment schemes are designed based on proper 
scoring rules, mathematical functions that make possible the revelation of private beliefs 
(Cooke, 1991). The essence behind honest reporting incentives is the observation that the 
private information a client obtains from interacting with a provider changes her belief re- 
garding the reports of other clients. This change in beliefs can be exploited to make honesty 
an ex-ante Nash equilibrium strategy. 

Jurca and Faltings (2006) extend the above result by taking a computational approach 
to designing incentive compatible payment schemes. Instead of using closed form scoring 
rules, they compute the payments using an optimization problem that minimizes the total 
budget required to reward the reporters. By also using several reference reports and filtering 
mechanisms, they render the payment mechanisms cheaper and more practical. 

Dellarocas (2005) presents a comprehensive investigation of binary sanctioning reputation
mechanisms. As in our setting, providers are equally capable of providing high quality;
however, doing so requires costly effort. The role of the reputation mechanism is to encourage
cooperative behavior by punishing cheating: negative feedback reduces future revenues
either by excluding the provider from the market, or by decreasing the price the provider 
can charge in future transactions. Dellarocas shows that simple information structures and 
decision rules can lead to efficient equilibria, given that clients report honestly. 

Our paper builds upon such mechanisms by addressing reporting incentives. We will ab- 
stract away the details of the underlying reputation mechanism through an explicit penalty 
associated with negative feedback. Given that sufficiently high penalties exist, any reputation
mechanism (i.e., feedback aggregation and trusting decision rules) can be plugged
into our scheme.

In the same group of work that addresses reporting incentives, we mention the work of 
Braynov and Sandholm (2002), Dellarocas (2002) and Papaioannou and Stamoulis (2005). 
Braynov and Sandholm consider exchanges of goods for money and prove that a market 
in which agents are trusted to the degree they deserve to be trusted is as efficient as
a market with complete trustworthiness. By scaling the amount of the traded product, 
the authors prove that it is possible to make it rational for sellers to truthfully declare 
their trustworthiness. Truthful declaration of one's trustworthiness eliminates the need of 
reputation mechanisms and significantly reduces the cost of trust management. However, 
the assumptions made about the trading environment (i.e. the form of the cost function and 
the selling price which is supposed to be smaller than the marginal cost) are not common 
in most electronic markets. 

For eBay-like auctions, the Goodwill Hunting mechanism (Dellarocas, 2002) provides a
way to make sellers indifferent between lying or truthfully declaring the quality of the good 
offered for sale. Momentary gains or losses obtained from misrepresenting the good's quality 
are later compensated by the mechanism which has the power to modify the announcement 
of the seller. 

Papaioannou and Stamoulis (2005) describe an incentive-compatible reputation mecha- 
nism that is particularly suited for peer-to-peer applications. Their mechanism is similar to 
ours, in the sense that both the provider and the client are punished for submitting conflict- 
ing reports. The authors experimentally show that a class of common lying strategies are 
successfully deterred by their scheme. Unlike their results, our paper considers all possible 
equilibrium strategies and sets bounds on the amount of untruthful information recorded 
by the reputation mechanism. 

3. The Setting 

We assume an online market, where rational clients (she) repeatedly request the same service
from one provider (he). Every client repeatedly interacts with the service provider;
however, successive requests from the same client are always interleaved with enough requests
generated by other clients. Transactions are assumed sequential; the provider does
not have capacity constraints, and accepts all requests.

The price of service is $p$ monetary units, and the service can have either high ($q_1$) or
low ($q_0$) quality. Only high quality is valuable to the clients, and has utility $u(q_1) = u$.
Low quality has utility 0, and can be precisely distinguished from high quality. Before each
round, the client can decide to request the service from the provider, or quit the market and
resort to an outside provider that is completely trustworthy. The outside provider always
delivers high quality service, but for a higher price $p(1 + \rho)$.

If the client decides to interact with the online provider, she issues a request to the 
provider, and pays for the service. The provider can now decide to exert low ($e_0$) or high
($e_1$) effort when treating the request. Low effort has a normalized cost of 0, but generates
only low quality. High effort is expensive (normalized cost equals $c(e_1) = c$) and generates
high quality with probability $\alpha < 1$. $\alpha$ is fixed, and depends on the environmental factors
outside the control of the provider. We assume $\alpha p > c$, so that it is individually rational
for providers to exert effort.

After exerting effort, the provider can observe the quality of the resulting service. He 
can then decide to deliver the service as it is, or to acknowledge failure and roll back the 
transaction by fully reimbursing 3 the client. We assume perfect delivery channels, such 
that the client perceives exactly the same quality as the provider. After delivery, the client 
inspects the quality of service, and can accuse low quality by submitting a negative report 
to the reputation mechanism. 

The reputation mechanism (RM) is unique in the market, and trusted by all participants. 
It can oversee monetary transactions (i.e., payments made between clients and the provider) 
and can impose fines on all parties. However, the RM does not observe the effort level 
exerted by the provider, nor does it know the quality of the delivered service. 

3. In reality, the provider might also pay a penalty for rolling back the transaction. As long as this penalty 
is small, the qualitative results we present in this paper remain valid. 

The RM asks the client for feedback only if she chose to transact with the provider in
the current round (i.e., paid the price of service to the provider) and the provider delivered
the service (i.e., the provider did not reimburse the client). When the client submits negative
feedback, the RM punishes both the client and the provider: the client must pay a fine $\varepsilon$,
and the provider accumulates a negative reputation report.
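
The bookkeeping rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `RoundOutcome` and `settle_round`, the signal labels, and the representation of the provider's sanction as a numeric penalty are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoundOutcome:
    signal: Optional[str]     # "positive", "negative", "neutral", or None (no transaction)
    client_fine: float        # fine charged to the client this round
    provider_penalty: float   # sanction charged to the provider this round


def settle_round(transacted: bool, delivered: bool, report: Optional[int],
                 client_fine: float, provider_penalty: float) -> RoundOutcome:
    """Hypothetical RM rule: feedback is requested only after a delivered service."""
    if not transacted:
        # The client used the outside option; nothing is recorded.
        return RoundOutcome(None, 0.0, 0.0)
    if not delivered:
        # The provider acknowledged failure and reimbursed; no feedback is requested.
        return RoundOutcome("neutral", 0.0, 0.0)
    if report not in (0, 1):
        raise ValueError("report must be 0 (negative) or 1 (positive)")
    if report == 0:
        # Conflicting claims: both parties are sanctioned.
        return RoundOutcome("negative", client_fine, provider_penalty)
    return RoundOutcome("positive", 0.0, 0.0)
```

For instance, `settle_round(True, True, 0, 0.01, 0.5)` records a negative signal and charges both parties, where the penalty value 0.5 is purely hypothetical.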

3.1 Examples 

Although simplistic, this model retains the main characteristics of several interesting ap- 
plications. A delivery service for perishable goods (goods that lose value past a certain 
deadline) is one of them. Pizza, for example, must be delivered within 30 minutes, otherwise
it gets cold and loses its taste. Hungry clients can order at home, or drive to a
more expensive local restaurant, where they're sure to get a hot pizza. The price of a home
delivered pizza is $p = 1$, while at the restaurant, the same pizza would cost $p(1 + \rho) = 1.2$.
In both cases, the utility of a warm meal is $u = 2$.

The pizza delivery provider must exert costly effort to deliver orders within the deadline. 
A courier must be dispatched immediately (high effort), for an estimated cost of $c = 0.8$.
While such action usually results in good service (the probability of a timely delivery is
$\alpha = 99\%$), traffic conditions and unexpected accidents (e.g., the address is not easily found)
may still delay some deliveries past the deadline. 

Once at the destination, the delivery person, as well as the client, knows if the delivery
was late or not. As is common practice, the provider can acknowledge being late, and
reimburse the client. Clients may provide feedback to a reputation mechanism, but their
feedback counts only if they were not reimbursed. The client's fine for submitting a negative
report can be set for example at $\varepsilon = 0.01$. The future loss to the provider caused by the
negative report (and quantified through $\bar{\varepsilon}$) depends on the reputation mechanism.
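
To make the pizza numbers concrete, the following Python sketch computes per-round expected payoffs under one specific assumption: the provider always exerts high effort, delivers timely pizzas, and reimburses late ones, so no negative report is ever filed. The function name and the payoff convention (client payoff equals utility minus price, provider payoff equals price minus cost) are assumptions of this sketch; the outside-option premium of 0.2 is inferred from the restaurant price of 1.2.

```python
def pizza_round_payoffs(p=1.0, premium=0.2, u=2.0, c=0.8, alpha=0.99):
    """Expected one-round payoffs when the provider behaves cooperatively.

    Assumed convention: a reimbursed (late) delivery leaves the client with 0
    and the provider with -c, since the effort cost is already sunk.
    """
    client_online = alpha * (u - p)        # hot pizza with probability alpha
    provider_online = alpha * p - c        # paid only for timely deliveries
    client_outside = u - p * (1 + premium) # restaurant: sure quality, higher price
    return client_online, provider_online, client_outside


client_online, provider_online, client_outside = pizza_round_payoffs()
# Effort is individually rational for the provider when alpha * p > c.
effort_is_rational = provider_online > 0
# The client prefers home delivery when it beats the restaurant in expectation.
client_prefers_online = client_online > client_outside
```

With these values the client expects about 0.99 from home delivery against 0.8 at the restaurant, and the provider expects about 0.19 per order, so both sides gain from the cooperative outcome.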

A simplified market of car mechanics or plumbers could fit the same model. The provider
is commissioned to repair a car (respectively the plumbing) and the quality of the work 
depends on the exerted effort. High effort is more costly but ensures a lasting result with 
high probability. Low effort is cheap, but the resulting fix is only temporary. In both 
cases, however, the warranty convention may specify the right of the client to ask for a 
reimbursement if problems reoccur within the warranty period. Reputation feedback may 
be submitted at the end of the warranty period, and is accepted only if reimbursements 
didn't occur. 

An interesting emerging application comes with a new generation of web services that 
can optimally decide how to treat every request. For some service types, a high quality 
response requires the exclusive use of costly resources. For example, computation jobs 
require CPU time, storage requests need disk space, information requests need queries to 
databases. Sufficient resources are a prerequisite, but not a guarantee, for good service.
Software and hardware failures may occur; however, these failures are properly signaled
to the provider. Once monetary incentives become sufficiently important in such markets, 
intelligent providers will identify the moral hazard problem, and may act strategically as 
identified in our model. 

4. Behavior and Reporting Incentives

From a game-theoretic point of view, one interaction between the client and the provider can
be modeled by the extensive-form game (G) with imperfect public information, shown in 
Figure 1. The client moves first and decides (at node 1) whether to play in and interact 
with the provider, or to play out and resort to the trusted outside option. 

Once the client plays in, the provider can choose at node 2 whether to exert high or
low effort (i.e., plays $e_1$ or $e_0$ respectively). When the provider plays $e_0$ the generated
quality is low. When the provider plays $e_1$, nature chooses between high quality ($q_1$) with
probability $\alpha$, and low quality ($q_0$) with probability $1 - \alpha$. The constant $\alpha$ is assumed
common knowledge in the market. Having seen the resulting quality, the provider delivers
(i.e., plays $d$) the service, or acknowledges low quality and rolls back the transaction (i.e.,
plays $l$) by fully reimbursing the client. If the service is delivered, the client can report
positive (1) or negative (0) feedback.

A pure strategy is a deterministic mapping describing an action for each of the player's
information sets. The client has three information sets in the game G. The first information
set is a singleton and contains node 1 at the beginning of the game, where the client must
decide between playing in or out. The second information set contains the nodes 7 and 8
(the dotted oval in Figure 1) where the client must decide between reporting 0 or 1, given
that she has received low quality, $q_0$. The third information set is a singleton and contains
node 9 where the client must decide between reporting 0 or 1, given that she received high
quality, $q_1$. The strategy $in\,0_{q_0}1_{q_1}$, for example, is the honest reporting strategy, specifying
that the client enters the game, reports 0 when she receives low quality, and reports 1 when
she receives high quality. The set of pure strategies of the client is:

$$A_C = \{out\,1_{q_0}1_{q_1},\ out\,1_{q_0}0_{q_1},\ out\,0_{q_0}1_{q_1},\ out\,0_{q_0}0_{q_1},\ in\,1_{q_0}1_{q_1},\ in\,1_{q_0}0_{q_1},\ in\,0_{q_0}1_{q_1},\ in\,0_{q_0}0_{q_1}\};$$

Similarly, the set of pure strategies of the provider is:

$$A_P = \{e_0 l,\ e_0 d,\ e_1 l_{q_0} l_{q_1},\ e_1 l_{q_0} d_{q_1},\ e_1 d_{q_0} l_{q_1},\ e_1 d_{q_0} d_{q_1}\};$$

where $e_1 l_{q_0} d_{q_1}$, for example, is the socially desired strategy: the provider exerts effort at
node 2, acknowledges low quality at node 5, and delivers high quality at node 6. A pure
strategy profile $s$ is a pair $(s_C, s_P)$ where $s_C \in A_C$ and $s_P \in A_P$. If $\Delta(A)$ denotes the set of
probability distributions over the elements of $A$, $\sigma_C \in \Delta(A_C)$ and $\sigma_P \in \Delta(A_P)$ are mixed
strategies for the client and the provider respectively, and $\sigma = (\sigma_C, \sigma_P)$ is a mixed strategy
profile.
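
As a quick check on the size of the two strategy sets, they can be enumerated mechanically from the information sets described above. The tuple encoding below (entry decision plus one report per quality level for the client; effort plus deliver or roll-back decisions for the provider) is an assumption of this sketch, not notation from the paper.

```python
from itertools import product


def client_strategies():
    """One action per client information set: entry, report after low, report after high."""
    return [(entry, report_q0, report_q1)
            for entry, report_q0, report_q1 in product(("out", "in"), (1, 0), (1, 0))]


def provider_strategies():
    """After low effort quality is surely low, so a single l/d decision suffices;
    after high effort the provider decides separately for low and high quality."""
    low_effort = [("e0", action) for action in ("l", "d")]
    high_effort = [("e1", action_q0, action_q1)
                   for action_q0, action_q1 in product(("l", "d"), repeat=2)]
    return low_effort + high_effort


HONEST_CLIENT = ("in", 0, 1)        # enter, report 0 on low quality, 1 on high quality
DESIRED_PROVIDER = ("e1", "l", "d")  # exert effort, roll back failures, deliver successes
```

The enumeration yields 8 client strategies and 6 provider strategies, hence 48 pure strategy profiles before the four equivalent "out" rows are collapsed.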

The payoffs to the players depend on the chosen strategy profile, and on the move
of nature. Let $g(\sigma) = (g_C(\sigma), g_P(\sigma))$ denote the pair of expected payoffs received by
the client and by the provider, respectively, when playing strategy profile $\sigma$. The function
$g : \Delta(A_C) \times \Delta(A_P) \to \mathbb{R}^2$ is characterized in Table 1 and also describes the normal form
transformation of G. Besides the corresponding payments made between the client and
the provider, Table 1 also reflects the influence of the reputation mechanism, as further
explained in Section 4.1. The four strategies of the client that involve playing out at node 1
generate the same outcomes, and therefore, have been collapsed for simplicity into a single
row of Table 1.
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Figure 1: The game representing one interaction. Empty circles represent decision nodes, 
edge labels represent actions, full circles represent terminal nodes and the dotted 
oval represents an information set. Payoffs are represented in rectangles, the top 
row describes the payoff of the client, the second row describes the payoff of the 
provider. 



4.1 The Reputation Mechanism 

For every interaction, the reputation mechanism records one of the three different signals it 
may receive: positive feedback when the client reports 1, negative feedback when the client 
reports 0, and neutral feedback when the provider rolls back the transaction and reimburses 
the client. In Figure 1 (and Table 1) positive and neutral feedback do not influence the 
payoff of the provider, while negative feedback imposes a punishment equivalent to $\bar{\varepsilon}$.

Two considerations made us choose this representation. First, we associate neutral and 
positive feedback with the same reward (0 in this case) because intuitively, the acknowl- 
edgement of failure may also be regarded as "honest" behavior on behalf of the provider. 
Failures occur despite best effort, and by acknowledging them, the provider shouldn't suffer. 

However, neutral feedback may also result because the provider did not exert effort. The
lack of punishment for these instances contradicts the goal of the reputation mechanism to
encourage exertion of effort. Fortunately, the action $e_0 l$ can be the result of rational behavior
only in two circumstances, both excusable: one, when the provider defends himself against
a malicious client that is expected to falsely report negative feedback (details in Section
6), and two, when the environmental noise is too big ($\alpha$ is too small) to justify exertion of
effort. Neutral feedback can be used to estimate the parameter $\alpha$, or to detect coalitions
of malicious clients, and indirectly, may influence the revenue of the provider. However,
for the simplified model presented above, positive and neutral feedback are considered the
same in terms of generated payoffs.

Table 1: Normal form transformation of the extensive form game, $G$

The second argument relates to the role of the RM in constraining the revenue of the
provider depending on the feedback of the client. There are several ways of doing that.
Dellarocas (2005) describes two principles, and two mechanisms that punish the provider
when the clients submit negative reports. The first works by exclusion. After each negative
report the reputation mechanism bans the provider from the market with probability $\pi$.
This probability can be tuned such that the provider has the incentive to cooperate almost
all the time, and the market stays efficient. The second works by changing the conditions
of future trade. Every negative report triggers a decrease of the price the next $N$ clients
will pay for the service. For lower values of $N$ the price decrease is higher; nonetheless, $N$
can take any value in an efficient market.

Both mechanisms work because the future losses offset the momentary gain the provider 
would have had by intentionally cheating on the client. Note that these penalties are given 
endogenously by lost future opportunities, and require some minimum premiums for trusted 
providers. When margins are not high enough, providers do not care enough about future
transactions, and will take the present opportunity to cheat.

Another option is to use exogenous penalties for cheating. For example, the provider
may be required to buy a licence for operating in the market.$^4$ The licence is partially
destroyed by every negative feedback. Totally destroyed licences must be restored through
a new payment, and remaining parts can be sold if the provider quits the market. The
price of the licence and the amount that is destroyed by a negative feedback can be scaled
such that rational providers have the incentive to cooperate. Unlike the previous solutions,
this mechanism does not require minimum transaction margins, as punishments for negative
feedback are directly subtracted from the upfront deposit.

4. The reputation mechanism can buy and sell market licences.

One way or another, all reputation mechanisms foster cooperation because the provider
associates value to client feedback. Let $V(R^+)$ and $V(R^-)$ be the value of a positive and
of a negative report, respectively. In the game in Figure 1, $V(R^+)$ is normalized to 0, and $V(R^-)$
is $\bar{\varepsilon}$. By using this notation, we abstract away the details of the reputation mechanism,
and retain only the essential punishment associated with negative feedback. Any reputation
mechanism can be plugged into our scheme, as long as the particular constraints (e.g.,
minimum margins for transactions) are satisfied.

One last aspect to be considered is the influence of the reputation mechanism on the 
future transactions of the client. If negative reports attract lower prices, rational long-run 
clients might be tempted to falsely report in order to purchase cheaper services in the future. 
Fortunately, some of the mechanisms designed for single-run clients do not influence the
reporting strategy of long-run clients. The reputation mechanism that only keeps the last 
N reports (Dellarocas, 2005) is one of them. A false negative report only influences the 
next N transactions of the provider; given that more than N other requests are interleaved 
between any two successive requests of the same client, a dishonest reporter cannot decrease 
the price for her future transactions. 

The licence-based mechanism we have described above is another example. The price 
of service remains unchanged, therefore reporting incentives are unaffected. On the other 
hand, when negative feedback is punished by exclusion, clients may be more reluctant to 
report negatively, since they also lose a trading partner. 

4.2 Analysis of Equilibria 

The one-time game presented in Figure 1 has only one subgame perfect equilibrium, in which
the client opts out. When asked to report feedback, the client always prefers to report 1 (reporting
0 attracts the penalty $\varepsilon$). Knowing this, the best strategy for the provider is to exert low
effort and deliver the service. Knowing the provider will play $e_0 d$, it is strictly better for
the client to play $out$.

The repeated game between the same client and provider may, however, have other 
equilibria. Before analyzing the repeated game, let us note that every interaction between a 
provider and a particular client can be strategically isolated and considered independently. 
As the provider accepts all clients and views them identically, he will maximize his expected 
revenue in each of the isolated repeated games. 

From now on, we will only consider the repeated interaction between the provider and
one client. This can be modeled by a $T$-fold repetition of the stage game $G$, denoted $G^T$,
where $T$ is finite or infinite. In this paper we will deal with the infinite horizon case; however,
the results obtained can also be applied with minor modifications to finitely repeated games
where $T$ is large enough.

If $\hat{\delta}$ is the per period discount factor reflecting the probability that the market ceases
to exist after each round (or the present value of future revenues), let us denote by $\delta$ the
expected discount factor in the game $G^T$. If our client interacts with the provider on
average every $N$ rounds, $\delta = \hat{\delta}^N$.

The life-time expected payoff of the players is computed as:

$$\sum_{\tau=0}^{T} \delta^{\tau} g_i^{\tau}$$

where $i \in \{C, P\}$ is the client or the provider, respectively, $g_i^{\tau}$ is the expected payoff obtained
by player $i$ in the $\tau$-th interaction, and $\delta^{\tau}$ is the discount applied to compute the present day
value of $g_i^{\tau}$.

We will consider normalized life-time expected payoffs, so that payoffs in $G$ and $G^T$ can
be expressed using the same measure:

$$V_i = (1-\delta)\sum_{\tau=0}^{T} \delta^{\tau} g_i^{\tau} \qquad (1)$$

We define the average continuation payoff for player $i$ from period $t$ onward (and
including period $t$) as:

$$V_i^t = (1-\delta)\sum_{\tau=t}^{T} \delta^{\tau-t} g_i^{\tau} \qquad (2)$$
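To make the normalization concrete, here is a minimal sketch of Equations (1) and (2) for a finite list of stage payoffs. The function name `normalized_payoff` and the truncation to a finite list are our own choices; the continuation payoff is assumed to discount relative to the starting period $t$.

```python
def normalized_payoff(stage_payoffs, delta, start=0):
    """(1 - delta) * sum over tau >= start of delta**(tau - start) * g[tau].

    With start=0 this is the normalized life-time payoff; with start=t it is
    the average continuation payoff from period t onward. The list is finite,
    so an infinite horizon is only approximated.
    """
    return (1 - delta) * sum(
        delta ** (tau - start) * g
        for tau, g in enumerate(stage_payoffs)
        if tau >= start
    )

# A constant stage payoff g repeated for many rounds has a normalized
# value close to g itself, which is the point of the (1 - delta) factor.
g = 0.19
value = normalized_payoff([g] * 2000, delta=0.95)
print(round(value, 6))
```

Because of the normalization, a stage-game payoff and a repeated-game payoff can be compared directly on the same scale.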

The set of outcomes publicly perceived by both players after each round is:

$$Y = \{out,\ l,\ q_0 1,\ q_0 0,\ q_1 1,\ q_1 0\}$$

where:

• $out$ is observed when the client opts out,

• $l$ is observed when the provider acknowledges low quality and rolls back the transaction,

• $q_i j$ is observed when the provider delivers quality $q_i \in \{q_0, q_1\}$ and the client reports
$j \in \{0, 1\}$.

We denote by $h^t$ a specific public history of the repeated game out of the set $H^t = (\times Y)^t$ of
all possible histories up to and including period $t$. In the repeated game, a public strategy
$\sigma_i$ of player $i$ is a sequence of maps $(\sigma_i^t)$, where $\sigma_i^t : H^{t-1} \to \Delta(A_i)$ prescribes the (mixed)
strategy to be played in round $t$, after the public history $h^{t-1} \in H^{t-1}$. A perfect public
equilibrium (PPE) is a profile of public strategies $\sigma = (\sigma_C, \sigma_P)$ that, beginning at any time
$t$ and given any public history $h^{t-1}$, form a Nash equilibrium from that point on (Fudenberg,
Levine, & Maskin, 1994). $V_i^t(\sigma)$ is the continuation payoff to player $i$ given by the strategy
profile $\sigma$.

$G$ is a game with product structure since any public outcome can be expressed as a vector
of two components $(y_C, y_P)$ such that the distribution of $y_i$ depends only on the actions of
player $i \in \{C, P\}$, the client or the provider, respectively. For such games, Fudenberg et al.
(1994) establish a Folk Theorem proving that any feasible, individually rational payoff
profile is achievable as a PPE of $G^\infty$ when the discount factor is close enough to 1. The set
of feasible, individually rational payoff profiles is characterized by:

• the minimax payoff to the client, obtained by the option $out$: $\underline{V}_C = u - p(1+\rho)$;

• the minimax payoff to the provider, obtained when the provider plays $e_0 l$: $\underline{V}_P = 0$;

• the pareto-optimal frontier (graphically presented in Figure 2) delimited by the
payoffs given by (linear combinations of) the strategy profiles $(in\,1_{q_0}1_{q_1},\ e_1 l_{q_0} d_{q_1})$,
$(in\,1_{q_0}1_{q_1},\ e_1 d_{q_0} d_{q_1})$ and $(in\,1_{q_0}1_{q_1},\ e_0 d)$;

and contains more than one point (i.e., the payoff when the client plays $out$) when
$\alpha(u-p) > u - p(1+\rho)$ and $\alpha p - c > 0$. Both conditions impose restrictions on the minimum margin
generated by a transaction such that the interaction is profitable. The PPE payoff profile
that gives the provider the maximum payoff is $(\underline{V}_C, \overline{V}_P)$ where:

$$\overline{V}_P = \begin{cases} \alpha u - c - u + p(1+\rho) & \text{if } \rho \le \frac{u(1-\alpha)}{p} \\ p + \frac{c(p\rho - u)}{\alpha u} & \text{if } \rho > \frac{u(1-\alpha)}{p} \end{cases}$$

and $\underline{V}_C$ is defined above.

Figure 2: The pareto-optimal frontier of the set of feasible, individually rational payoff
profiles of $G$.

While completely characterizing the set of PPE payoffs for discount factors strictly 
smaller than 1 is outside the scope of this paper, let us note the following results: 

First, if the discount factor is high enough (but strictly less than 1) with respect to the 
profit margin obtained by the provider from one interaction, there is at least one PPE such 
that the reputation mechanism records only honest reports. Moreover, this equilibrium is 
pareto-optimal. 

Proposition 1 When $\delta > \frac{p}{p(1+\alpha) - c}$, the strategy profile:

• the provider always exerts high effort, and delivers only high quality; if the client
deviates from the equilibrium, the provider switches to $e_0 d$ for the rest of the rounds;

• the client always reports 1 when asked to submit feedback; if the provider deviates
(i.e., she receives low quality), the client switches to $out$ for the rest of the rounds;

is a pareto-optimal PPE.

Proof. It is not profitable for the client to deviate from the equilibrium path. Reporting
0 attracts the penalty $\varepsilon$ in the present round, and the termination of the interaction with
the provider (the provider stops exerting effort from that round onwards).

The provider, on the other hand, can momentarily gain by deviating to $e_1 d_{q_0} d_{q_1}$ or $e_0 d$.
A deviation to $e_1 d_{q_0} d_{q_1}$ gives an expected momentary gain of $p(1-\alpha)$ and an expected
continuation loss of $(1-\alpha)(\alpha p - c)$. A deviation to $e_0 d$ brings an expected momentary
gain equal to $(1-\alpha)p + c$ and an expected continuation loss of $\alpha p - c$. For a discount
factor satisfying our hypothesis, neither deviation is profitable. The discount factor
is high enough with respect to profit margins that the future revenues given by the
equilibrium strategy offset the momentary gains obtained by deviating.

The equilibrium payoff profile is $(V_C, V_P) = (\alpha(u-p),\ \alpha p - c)$, which is pareto-optimal
and socially efficient. □

Second, we can prove that the client never reports negative feedback in any pareto-optimal
PPE, regardless of the value of the discount factor. The restriction to pareto-optimal
equilibria is justifiable for practical reasons: assuming that the client and the provider can somehow
negotiate the equilibrium they are going to play, it makes most sense to choose one of the
pareto-optimal equilibria.

Proposition 2 The probability that the client reports negative feedback on the equilibrium 
path of any pareto-optimal PPE strategy is zero. 

Sketch of Proof. The full proof presented in Appendix A proceeds in the following steps.
Step 1, all equilibrium payoffs can be expressed by adding the present round payoff to the
discounted continuation payoff from the next round onward. Step 2, take the PPE payoff
profile $V = (V_C, V_P)$, such that there is no other PPE payoff profile $V' = (V'_C, V_P)$ with
$V_C < V'_C$. The client never reports negative feedback in the first round of the equilibrium
that gives $V$. Step 3, the equilibrium continuation payoff after the first round also satisfies
the conditions set for $V$. Hence, the probability that the client reports negative feedback on
the equilibrium path that gives $V$ is 0. Pareto-optimal PPE payoff profiles clearly satisfy
the definition of $V$, hence the result of the proposition. □

The third result we want to mention here is that there is an upper bound on the
percentage of false reports recorded by the reputation mechanism in any of the pareto-optimal
equilibria.

Proposition 3 The upper bound on the percentage of false reports recorded by the reputation
mechanism in any PPE is:

$$\gamma = \begin{cases} \frac{(1-\alpha)(p-u) + p\rho}{p} & \text{if } p\rho \le u(1-\alpha) \\ \frac{p\rho}{u} & \text{if } p\rho > u(1-\alpha) \end{cases} \qquad (3)$$

Sketch of Proof. The full proof presented in Appendix B builds directly on the result of
Proposition 2. Since clients never report negative feedback along pareto-optimal equilibria,
the only false reports recorded by the reputation mechanism appear when the provider
delivers low quality, and the client reports positive feedback. However, any PPE profile
must give the client at least $\underline{V}_C = u - p(1+\rho)$, otherwise the client is better off by resorting
to the outside option. Every round in which the provider deliberately delivers low quality
gives the client a payoff strictly smaller than $u - p(1+\rho)$. An equilibrium payoff greater
than $\underline{V}_C$ is therefore possible only when the percentage of rounds where the provider delivers
low quality is bounded. The same bound limits the percentage of false reports recorded by
the reputation mechanism. □

For a more intuitive understanding of the results presented in this section, let us refer
to the pizza delivery example detailed in Section 3.1. The price of a home delivered pizza is
$p = 1$, while at the local restaurant the same pizza would cost $p(1+\rho) = 1.2$. The utility of
a warm pizza to the client is $u = 2$, the cost of delivery is $c = 0.8$ and the probability that
unexpected traffic conditions delay the delivery beyond the 30 minutes deadline (despite
the best effort of the provider) is $1 - \alpha = 0.01$.

The client can secure a minimax payoff of $\underline{V}_C = u - p(1+\rho) = 0.8$ by always going
out to the restaurant. However, the socially desired equilibrium happens when the client
orders pizza at home, and the pizza service exerts effort to deliver pizza in time: in this
case the payoff of the client is $V_C = \alpha(u-p) = 0.99$, while the payoff of the provider is
$V_P = \alpha p - c = 0.19$.

Proposition 1 gives a lower bound on the discount factor of the pizza delivery service
such that repeated clients can expect the socially desired equilibrium. This bound is
$\delta = \frac{p}{p(1+\alpha) - c} = 0.84$; assuming that the daily discount factor of the pizza service is $\hat{\delta} = 0.996$,
the same client must order pizza at home at least once every 6 weeks. The values of the
discount factors can also be interpreted in terms of the minimum number of rounds the
client (and the provider) will likely play the game. For example, the discount factor can be
viewed as the probability that the client (respectively the provider) will "live" for another
interaction in the market. It follows that the average lifetime of the provider is at least
$1/(1-\hat{\delta}) = 250$ interactions (with all clients), while the average lifetime of the client is at
least $\lceil 1/(1-\delta) \rceil = 7$ interactions (with the same pizza delivery service). These are clearly
realistic numbers.
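The numbers in this paragraph can be reproduced with a short sketch. It assumes normalized payoffs, so that a one-shot deviation with momentary gain $G$ and continuation loss $L$ is unprofitable when $(1-\delta)G \le \delta L$; the function name `deviation_thresholds` and this way of organizing the computation are ours, while the gains and losses are those quoted in the proof of Proposition 1.

```python
import math

def deviation_thresholds(p, c, alpha):
    """Smallest discount factor at which each provider deviation stops paying.

    Assumes a deviation is unprofitable when (1 - delta) * gain <= delta * loss,
    i.e. delta >= gain / (gain + loss).
    """
    v_eq = alpha * p - c                                        # equilibrium stage payoff
    gain_dd, loss_dd = p * (1 - alpha), (1 - alpha) * v_eq      # deviate to e1 d_q0 d_q1
    gain_e0, loss_e0 = (1 - alpha) * p + c, v_eq                # deviate to e0 d
    return gain_dd / (gain_dd + loss_dd), gain_e0 / (gain_e0 + loss_e0)

p, c, alpha = 1.0, 0.8, 0.99            # pizza example values from the text
delta_min = max(deviation_thresholds(p, c, alpha))

daily = 0.996                           # daily discount factor quoted in the text
# Largest number of days N between orders such that daily**N >= delta_min.
max_gap_days = math.floor(math.log(delta_min) / math.log(daily))

print(round(delta_min, 2), max_gap_days, round(1 / (1 - daily)))
```

The binding constraint is the deviation to $e_1 d_{q_0} d_{q_1}$, whose threshold simplifies to $p / (p(1+\alpha) - c)$, and the resulting maximum gap between orders is roughly six weeks.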

Proposition 3 gives an upper bound on the percentage of false reports that our mechanism
may record in equilibrium from the clients. As $u(1-\alpha) = 0.02 < 0.2 = p\rho$, this limit
is:

$$\gamma = \frac{p\rho}{u} = 0.1;$$

It follows that at least 90% of the reports recorded by our mechanism (in any equilibrium)
are correct. The false reports (false positive reports) result from rare cases where the pizza
delivery is intentionally delayed to save some cost but clients do not complain. The false
report can be justified, for example, by the provider's threat to refuse future orders from
clients that complain. Given that late deliveries are still rare enough, clients are better off
with the home delivery than with the restaurant, hence they accept the threat. As other
options become available to the clients (e.g., competing delivery services) the bound $\gamma$ will
decrease.

Please note that the upper bound defined by Proposition 3 only depends on the outside
alternative available to the client, and is not influenced by the punishment $\bar{\varepsilon}$ introduced
by the reputation mechanism. This happens because the revenue of a client is independent
of the interactions of other clients, and therefore of the reputation information as reported
by other clients. Equilibrium strategies are exclusively based on the direct experience of
the client. In the following section, however, we will refine this bound by considering that
clients can build a reputation for reporting honestly. There, the punishment $\bar{\varepsilon}$ plays an
important role.
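The 10% figure for the pizza example can be checked numerically. The sketch below is our own derivation, not a formula quoted from the paper: it assumes the worst case is reached on the pareto-optimal frontier while the client is held at her outside option $u - p(1+\rho)$, and counts the share of rounds in which low quality is delivered. The function name `false_report_bound` is hypothetical.

```python
def false_report_bound(u, p, alpha, rho):
    """Share of rounds with delivered low quality when the client is held at
    her outside option and play stays on the pareto-optimal frontier
    (our assumption about where the worst case is attained)."""
    v_min = u - p * (1 + rho)              # client's outside option
    if v_min >= alpha * u - p:
        # Frontier segment mixing (e1 l_q0 d_q1) with (e1 d_q0 d_q1).
        return (alpha * (u - p) - v_min) / p
    # Frontier segment mixing (e1 d_q0 d_q1) with (e0 d).
    return (u - p - v_min) / u

# Pizza example: u = 2, p = 1, alpha = 0.99, p * (1 + rho) = 1.2.
gamma = false_report_bound(u=2.0, p=1.0, alpha=0.99, rho=0.2)
print(round(gamma, 3))
```

On the second branch the expression reduces to $p\rho / u$, which matches the value $0.1$ quoted for the pizza example; a cheaper outside option (smaller $\rho$) lowers the bound, in line with the remark about competing delivery services.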

5. Building a Reputation for Truthful Reporting 

An immediate consequence of Propositions 2 and 3 is that the provider can extract all of the
surplus created by the transactions by occasionally delivering low quality, and convincing
the clients not to report negative feedback (providers can do so by promising sufficiently high
continuation payoffs that prevent the client from resorting to the outside provider). Assuming
that the provider has more "power" in the market, he could influence the choice of the
equilibrium strategy to one that gives him the most revenue, and holds the clients close to
the minimax payoff $\underline{V}_C = u - p(1+\rho)$ given by the outside option.$^5$

However, a client who could commit to report honestly (i.e., commit to play the strategy
$s_C^* = in\,0_{q_0}1_{q_1}$) would benefit from cooperative trade. The provider's best response against
$s_C^*$ is to play $e_1 l_{q_0} d_{q_1}$ repeatedly, which leads the game to the socially efficient outcome.
Unfortunately the commitment to $s_C^*$ is not credible in the complete information game, for
the reasons explained in Section 4.2.

Following the results of Kreps et al. (1982), Fudenberg and Levine (1989) and Schmidt
(1993) we know that such honest reporting commitments may become credible in a game
with incomplete information. Suppose that the provider has incomplete information in $G^\infty$
and believes with non-zero probability that he is facing a committed client that always
reports the truth. A rational client can then "fake" the committed client, and "build a
reputation" for reporting honestly. When the reputation becomes credible, the provider
will play $e_1 l_{q_0} d_{q_1}$ (the best response against $s_C^*$), which is better for the client than the
payoff she would obtain if the provider knew she was the "rational" type.

As an effect of reputation building, the set of equilibrium points is reduced to a set 
where the payoff to the client is higher than the payoff obtained by a client committed to 
report honestly. As anticipated from Proposition 3, a smaller set of equilibrium points also 
reduces the bound of false reports recorded by the reputation mechanism. In certain cases, 
this bound can be reduced to almost zero. 

Formally, incomplete information can be modeled by a perturbation of the complete
information repeated game $G^\infty$ such that in period 0 (before the first round of the game is
played) the "type" of the client is drawn by nature out of a countable set $\Theta$ according to
the probability measure $\mu$. The client's payoff now additionally depends on her type. We
say that in the perturbed game $G^\infty(\mu)$ the provider has incomplete information because he
is not sure about the true type of the client.

5. All pareto-optimal PPE payoff profiles are also renegotiation-proof (Bernheim & Ray, 1989; Farrell &
Maskin, 1989). This follows from the proof of Proposition 3: the continuation payoffs enforcing a pareto-optimal
PPE payoff profile are also pareto-optimal. Therefore, clients falsely report positive feedback
even under the more restrictive notion of renegotiation-proof equilibrium.

Two types from $\Theta$ are of particular importance:

• The "normal" type of the client, denoted by $\theta_0$, is the rational client who has the
payoffs presented in Figure 1.

• The "commitment" type of the client, denoted by $\theta^*$, always prefers to play the commitment
strategy $s_C^*$. From a rational perspective, the commitment type client obtains
an arbitrarily high supplementary reward for reporting the truth. This external reward
makes the strategy $s_C^*$ the dominant strategy, and therefore, no commitment
type client will play anything other than $s_C^*$.

In Theorem 1 we give an upper bound $k_P$ on the number of times the provider delivers
low quality in $G^\infty(\mu)$, given that he always observes the client reporting honestly.

The intuition behind this result is the following. The provider's best response to an
honest reporter is $e_1 l_{q_0} d_{q_1}$: always exert high effort, and deliver only when the quality is
high. This gives the commitment type client her maximum attainable payoff in $G^\infty(\mu)$,
corresponding to the socially efficient outcome. The provider, however, would be better off
by playing against the normal type client, against whom he can obtain an expected payoff
greater than $\alpha p - c$.

The normal type client may be distinguished from a commitment type client only in
the rounds when the provider delivers low quality: the commitment type always reports
negative feedback, while the normal type might decide to report positive feedback in order
to avoid the penalty $\varepsilon$. The provider can therefore decide to deliver low quality to the client
in order to test her real type. The question is how many times the provider should test
the true type of the client.

Every failed test (i.e., the provider delivers low quality and the client reports negative
feedback) generates a loss of $\bar{\varepsilon}$ to the provider, and slightly reinforces the belief that the
client reports honestly. Since the provider cannot wait infinitely for future payoffs, there
must be a time when the provider stops testing the type of the client, and agrees to
play the socially efficient strategy, $e_1 l_{q_0} d_{q_1}$.

The switch to the socially efficient strategy is not triggered by a revelation of the client's
type. The provider believes that the client behaves as if she were a commitment type, not
that the client is a commitment type. The client may very well be a normal type who
chooses to mimic the commitment type, in the hope that she will obtain better service from
the provider. However, further trying to determine the true type of the client is too costly
for the provider. Therefore, the provider chooses to play $e_1 l_{q_0} d_{q_1}$, which is the best response
to the commitment strategy $s_C^*$.

Theorem 1 If the provider has incomplete information in $G^\infty$, and assigns positive probability
to the normal and commitment type of the client ($\mu(\theta_0) > 0$, $\mu^* = \mu(\theta^*) > 0$), there
is a finite upper bound, $k_P$, on the number of times the provider delivers low quality in any
equilibrium of $G^\infty(\mu)$. This upper bound is:

$$k_P = \left\lfloor \frac{\ln \mu^*}{\ln \dfrac{\delta(\overline{V}_P - \alpha p + c) + (1-\delta)p}{\delta(\overline{V}_P - \alpha p + c) + (1-\delta)\bar{\varepsilon}}} \right\rfloor \qquad (4)$$

Proof. First, we use an important result obtained by Fudenberg and Levine (1989) about
statistical inference (Lemma 1): If every previously delivered low quality service was sanctioned
by a negative report, the provider must expect with increasing probability that his
next low quality delivery will also be sanctioned by negative feedback. Technically, for any
$\pi < 1$, the provider can deliver at most $n(\pi)$ low quality services (sanctioned by negative
feedback) before expecting that the $(n(\pi) + 1)$-th low quality delivery will also be sanctioned by
negative feedback with probability greater than $\pi$. This number equals:

$$n(\pi) = \left\lfloor \frac{\ln \mu^*}{\ln \pi} \right\rfloor$$



As stated earlier, this lemma does not prove that the provider will become convinced
that he is facing a commitment type client. It simply proves that after a finite number of
rounds the provider becomes convinced that the client is playing as if she were a commitment
type.

Second, if $\pi > \frac{\delta \overline{V}_P}{\delta \overline{V}_P + (1-\delta)\bar{\varepsilon}}$ but is strictly smaller than 1, the rational provider does
not deliver low quality (it is easy to verify that the maximum discounted future gain does
not compensate for the risk of getting a negative feedback in the present round). By the
previously mentioned lemma, it must be that in any equilibrium, the provider delivers low
quality a finite number of times.

Third, let us analyze the round, $\bar{t}$, when the provider is about to deliver a low quality
service (play $d_{q_0}$) for the last time. If $\pi$ is the belief of the provider that the client reports
honestly in round $\bar{t}$, his expected payoff (just before deciding to deliver the low quality
service) can be computed as follows:

• with probability $\pi$ the client reports 0. Her reputation for reporting honestly becomes
credible, so the provider plays $e_1 l_{q_0} d_{q_1}$ in all subsequent rounds. The provider gains
$p - \bar{\varepsilon}$ in the current round, and expects $\alpha p - c$ for the subsequent rounds;

• with probability $1 - \pi$, the client reports 1 and deviates from the commitment strategy,
the provider knows he is facing a rational client, and can choose a continuation PPE
strategy from the complete information game. He gains $p$ in the current round, and
expects at most $\overline{V}_P$ in the subsequent rounds;

$$V_P \le (1-\delta)(p - \pi\bar{\varepsilon}) + \delta\big(\pi(\alpha p - c) + (1-\pi)\overline{V}_P\big)$$

On the other hand, had the provider acknowledged the low quality and rolled back the
transaction (i.e., played $l_{q_0}$), his expected payoff would have been at least:

$$V'_P \ge (1-\delta)\cdot 0 + \delta(\alpha p - c)$$



Since the provider chooses nonetheless to play $d_{q_0}$ it must be that $V_P \ge V'_P$, which is
equivalent to:

$$\pi \le \bar{\pi} = \frac{\delta(\overline{V}_P - \alpha p + c) + (1-\delta)p}{\delta(\overline{V}_P - \alpha p + c) + (1-\delta)\bar{\varepsilon}} \qquad (5)$$

Finally, by substituting Equation (5) into the definition of $n(\pi)$ we obtain the upper bound
on the number of times the provider delivers low quality service to a client committed to
report honestly. □
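The last step of the proof can be sketched numerically for the pizza example. The function name `max_low_quality_deliveries` is ours, the bound is taken as the integer part of $\ln \mu^* / \ln \bar{\pi}$, and the value used for the provider's maximum complete-information payoff (`v_p_max`) is an assumption: it is our own evaluation of the frontier point that holds the client at her outside option, not a number quoted in the text. The discount factor 0.95 and the provider penalty 2.5 are the values the paper uses for its plot of $k_P$.

```python
import math

def max_low_quality_deliveries(mu_star, delta, p, c, alpha, v_p_max, eps_bar):
    """Integer part of ln(mu*) / ln(pi_bar), with pi_bar as in Equation (5)."""
    gap = delta * (v_p_max - (alpha * p - c))
    pi_bar = (gap + (1 - delta) * p) / (gap + (1 - delta) * eps_bar)
    return math.floor(math.log(mu_star) / math.log(pi_bar))

u, p, c, alpha, rho = 2.0, 1.0, 0.8, 0.99, 0.2   # pizza example
delta, eps_bar = 0.95, 2.5                       # values used for the k_P plot

# Assumption: provider's best complete-information PPE payoff, evaluated on
# the frontier segment between (e1 d_q0 d_q1) and (e0 d) with the client
# held at u - p * (1 + rho). Roughly 0.27 for these parameters.
v_p_max = p + c * (p * rho - u) / (alpha * u)

k_20 = max_low_quality_deliveries(0.20, delta, p, c, alpha, v_p_max, eps_bar)
k_40 = max_low_quality_deliveries(0.40, delta, p, c, alpha, v_p_max, eps_bar)
print(k_20, k_40)
```

Under these assumptions a prior of 20% allows at most three low quality deliveries and a prior of 40% at most one. Note that $\bar{\pi} < 1$ requires $\bar{\varepsilon} > p$, so the sketch is only meaningful when the provider's penalty exceeds the price.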

The existence of $k_P$ further reduces the possible equilibrium payoffs a client can get in
$G^\infty(\mu)$. Consider a rational client who receives low quality for the first time. She has the
following options:

• report negative feedback and attempt to build a reputation for reporting honestly.
Her payoff for the current round is $-p - \varepsilon$. Moreover, her worst case expectation for
the future is that the next $k_P - 1$ rounds will also give her $-p - \varepsilon$, followed by the
commitment payoff equal to $\alpha(u - p)$:

$$V_C|0 = (1-\delta)(-p-\varepsilon) + \delta(1-\delta^{k_P-1})(-p-\varepsilon) + \delta^{k_P}\alpha(u-p); \qquad (6)$$

• on the other hand, by reporting positive feedback she reveals herself to be a normal type,
loses only $p$ in the current round, and expects a continuation payoff equal to $V_C$ given
by a PPE strategy profile of the complete information game $G^\infty$:

$$V_C|1 = (1-\delta)(-p) + \delta V_C; \qquad (7)$$

The reputation mechanism records false reports only when clients do not have the
incentive to build a reputation for reporting honestly, and $V_C|1 > V_C|0$; this is true for:

$$V_C > \hat{V}_C = \delta^{k_P-1}\alpha(u-p) - (1-\delta^{k_P-1})(p+\varepsilon) - \frac{1-\delta}{\delta}\varepsilon;$$

Following the argument of Proposition 3 we can obtain a bound on the percentage of
false reports recorded by the reputation mechanism in a pareto-optimal PPE that gives the
client at least $\hat{V}_C$:

$$\hat{\gamma} = \begin{cases} \frac{\alpha(u-p) - \hat{V}_C}{p} & \text{if } \hat{V}_C \ge \alpha u - p \\ \frac{u - p - \hat{V}_C}{u} & \text{if } \hat{V}_C < \alpha u - p \end{cases} \qquad (8)$$



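Of particular importance is the case when $k_P = 1$. $\hat{V}_C$ and $\hat{\gamma}$ become:

$$\hat{V}_C = \alpha(u-p) - \frac{1-\delta}{\delta}\varepsilon; \qquad \hat{\gamma} = \frac{(1-\delta)\varepsilon}{\delta p} \qquad (9)$$

so the probability of recording a false report (after the first one) can be arbitrarily close to
0 as $\varepsilon \to 0$.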

For the pizza delivery example introduced in Section 3.1, Figure 3 plots the bound, $k_P$,
defined in Theorem 1, as a function of the prior belief ($\mu^*$) of the provider that the client
is an honest reporter. We have used a value of the discount factor equal to $\delta = 0.95$, such
that on average, every client interacts $1/(1-\delta) = 20$ times with the same provider. The
penalty for negative feedback was taken as $\bar{\varepsilon} = 2.5$. When the provider believes that 20% of
the clients always report honestly, he will deliver low quality at most 3 times. When the
belief goes up to $\mu^* = 40\%$ no rational provider will deliver low quality more than once.

Figure 3: The upper bound $k_P$ as a function of the prior belief $\mu^*$.

In Figure 4 we plot the values of the bounds $\gamma$ (Equation (3)) and $\hat{\gamma}$ (Equation (8)) as
a function of the prior belief $\mu^*$. The bounds simultaneously hold, therefore the maximum
percentage of false reports recorded by the reputation mechanism is the minimum of the
two. When $\mu^*$ is less than 0.25, $k_P > 2$, $\gamma < \hat{\gamma}$, and the reputation effect does not significantly
reduce the worst case percentage of false reports recorded by the mechanism. However, when
$\mu^* \in (0.25, 0.4)$ the reputation mechanism records (in the worst case) only half as many false
reports, and for $\mu^* > 0.4$, the percentage of false reports drops to 0.005. This probability
can be further decreased by decreasing the penalty $\varepsilon$. In the limit, as $\varepsilon$ approaches 0, the
reputation mechanism will register a false report with vanishing probability.
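The staircase behaviour described above can be reproduced with a small sketch of Equation (8). The function name `refined_bound` is ours. The text gives no numeric value for the client's reporting cost in this passage, so the value 0.1 used below is purely an assumption, chosen because it reproduces the quoted figure of 0.005 for $k_P = 1$.

```python
def refined_bound(k_p, delta, u, p, alpha, eps):
    """Bound on false reports when the client can build a reputation.

    v_hat is the smallest complete-information payoff that makes a rational
    client prefer revealing her type over imitating the commitment type.
    """
    commit = alpha * (u - p)                     # commitment payoff
    d = delta ** (k_p - 1)
    v_hat = d * commit - (1 - d) * (p + eps) - (1 - delta) / delta * eps
    if v_hat >= alpha * u - p:
        return (commit - v_hat) / p
    return (u - p - v_hat) / u

u, p, alpha, delta = 2.0, 1.0, 0.99, 0.95        # pizza example, delta from the text
eps = 0.1                                        # ASSUMED client cost of a negative report
gamma = 0.1                                      # bound of Proposition 3 for this example

for k_p in (1, 2, 3):
    gamma_hat = refined_bound(k_p, delta, u, p, alpha, eps)
    print(k_p, round(gamma_hat, 4), round(min(gamma, gamma_hat), 4))
```

With these inputs the refined bound is about 0.005 for $k_P = 1$, about 0.06 for $k_P = 2$ (roughly half of $\gamma$), and exceeds $\gamma$ for $k_P = 3$, so the minimum of the two bounds falls back to $\gamma$.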

The result of Theorem 1 has to be interpreted as a worst case scenario. In real markets, 
providers that already have a small predisposition to cooperate will defect fewer times. 
Moreover, the mechanism is self enforcing, in the sense that the more clients act as commit- 
ment types, the higher will be the prior beliefs of the providers that new, unknown clients 
will report truthfully, and therefore the easier it will be for the new clients to act as truthful 
reporters. 

As mentioned at the end of Section 4.2, the bound $\hat{\gamma}$ strongly depends on the punishment
$\bar{\varepsilon}$ imposed by the reputation mechanism for a negative feedback. The higher $\bar{\varepsilon}$, the easier it
is for clients to build a reputation, and therefore, the lower the amount of false information
recorded by the reputation mechanism.

6. The Threat of Malicious Clients 

The mechanism described so far encourages service providers to do their best and deliver
good service. The clients were assumed rational, or committed to report honestly, and
in either case, they never report negative feedback unfairly. In this section, we investigate
what happens when clients explicitly try to "hurt" the providers by submitting fake negative
ratings to the reputation mechanism.

Figure 4: The maximum probability of recording a false report as a function of the prior
belief $\mu_0^*$.

An immediate consequence of fake negative reports is that clients lose money. However,
the cost $\varepsilon$ of a negative report would probably be too small to deter clients with
separate agendas from hurting the provider. Fortunately, the mechanism we propose naturally
protects service providers from consistent attacks initiated by malicious clients.

Formally, a malicious type client, $\theta_\beta \in \Theta$, obtains a supplementary (external) payoff $\beta$
for reporting negative feedback. Obviously, $\beta$ has to be greater than the penalty $\varepsilon$, otherwise
the results of Proposition 2 would apply. In the incomplete information game $G^\infty(\mu)$, the
provider now assigns non-zero initial probability to the belief that the client is malicious.
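The provider's belief over client types is revised by Bayes' rule after each observed report. The sketch below is illustrative only: the type labels, the prior and the malicious type's misreporting probability are hypothetical numbers, while the zero likelihoods encode the assumption stated above that rational and committed clients never report negative feedback unfairly.

```python
def posterior(prior: dict, likelihood: dict) -> dict:
    """Bayes' rule over client types.

    prior[t]      = Pr[client is of type t]
    likelihood[t] = Pr[observed report | client is of type t]
    """
    evidence = sum(prior[t] * likelihood[t] for t in prior)
    if evidence == 0.0:
        raise ValueError("the observation has zero probability under the prior")
    return {t: prior[t] * likelihood[t] / evidence for t in prior}


# Hypothetical prior beliefs of the provider (illustrative numbers only).
prior = {"normal": 0.70, "honest": 0.25, "malicious": 0.05}

# Probability of a negative report after HIGH quality. It is zero for the
# normal and honest types; the 0.5 for the malicious type is arbitrary,
# since any strictly positive value leads to the same posterior.
neg_report_after_high_quality = {"normal": 0.0, "honest": 0.0, "malicious": 0.5}

if __name__ == "__main__":
    print(posterior(prior, neg_report_after_high_quality))
```

However small the prior weight on the malicious type, a single unfair negative report moves all of the posterior mass onto it, provided that no other type can produce such a report.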

When only the normal type, $\theta_0$, the honest reporter type, $\theta^*$, and the malicious type,
$\theta_\beta$, have non-zero initial probability, the mechanism we describe is robust against unfair
negative reports. The first false negative report exposes the client as being malicious, since
neither the normal nor the commitment type reports 0 after receiving high quality. By
Bayes' Law, the provider's updated belief following a false negative report must assign
probability 1 to the malicious type. Although providers are not allowed to refuse service
requests, they can protect themselves against malicious clients by playing $e_0 l$: i.e., exert
low effort and reimburse the client afterwards. The RM records neutral feedback in this
case, and does not sanction the provider. Against $e_0 l$, malicious clients are better off
quitting the market (opt out), thus stopping the attack. The RM records at most one false
negative report for every malicious client, and assuming that identity changes are difficult,
providers are not vulnerable to unfair punishments.

When other types (besides $\theta_0$, $\theta^*$ and $\theta_\beta$) have non-zero initial probability, malicious
clients are harder to detect. They could masquerade as client types that are normal, but
accidentally misreport. It is not rational for the provider to immediately exclude (by playing
$e_0 l$) normal clients that rarely misreport: the majority of the cooperative transactions
rewarded by positive feedback still generate positive payoffs. Let us now consider the
client type $\theta_0(\nu) \in \Theta$ that behaves exactly like the normal type, but misreports 0 instead
of 1 independently with probability $\nu$. When interacting with the client type $\theta_0(\nu)$, the
provider receives the maximum number of unfair negative reports when playing the efficient
equilibrium: i.e., $e_1 l_{q_0} d_{q_1}$. In this case, the provider's expected payoff is:

$$V_P = \alpha p - c - \nu \bar{\varepsilon}$$

Since $V_P$ has to be positive (the minimax payoff of the provider is 0, given by $e_0 l$), it must
be that $\nu < \frac{\alpha p - c}{\bar{\varepsilon}}$.
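As a numerical illustration of this condition, the following sketch evaluates the provider's expected payoff and the largest tolerable misreporting rate. It assumes that the payoff has the form $\alpha p - c - \nu \bar{\varepsilon}$ with $\bar{\varepsilon}$ the punishment per negative report; all parameter values are hypothetical and the function names are not from the paper.

```python
def provider_payoff(alpha: float, p: float, c: float, eps_bar: float, nu: float) -> float:
    """Expected per-round payoff: alpha * p - c - nu * eps_bar."""
    return alpha * p - c - nu * eps_bar


def max_misreport_rate(alpha: float, p: float, c: float, eps_bar: float) -> float:
    """Supremum of nu that keeps the provider's payoff positive, clipped to [0, 1]."""
    if eps_bar <= 0.0:
        raise ValueError("eps_bar must be positive")
    return min(1.0, max(0.0, (alpha * p - c) / eps_bar))


if __name__ == "__main__":
    # Hypothetical parameters: success probability, price, cost of effort, punishment.
    alpha, p, c, eps_bar = 0.9, 10.0, 5.0, 20.0
    nu_max = max_misreport_rate(alpha, p, c, eps_bar)
    print(nu_max)
    print(provider_payoff(alpha, p, c, eps_bar, nu_max / 2))
```

A larger punishment lowers the threshold, so providers stop cooperating with error-prone clients sooner.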

The maximum value of $\nu$ is also a good approximation for the maximum percentage
of false negative reports the malicious type can submit to the reputation mechanism. Any
significantly higher number of harmful reports exposes the malicious type and allows the
provider to defend himself.

Note, however, that the malicious type can submit a fraction $\nu$ of false reports only
when the type $\theta_0(\nu)$ has positive prior probability. When the provider does not believe that
a normal client can make so many mistakes (even if the percentage of false reports is still
low enough to generate positive revenues), he attributes the false reports to a malicious type,
and disengages from cooperative behavior. Therefore, one method to reduce the impact of
malicious clients is to make sure that normal clients make few or no mistakes. Technical
means (for example, automated tools for formatting and submitting feedback)
or improved user interfaces (that make it easier for human users to spot reporting mistakes)
will greatly limit the percentage of mistakes made by normal clients and, therefore, also
reduce the amount of harm done by malicious clients.

One concrete method for reducing mistakes is to solicit only negative feedback from the
clients (the principle that no news is good news, also applied by Dellarocas (2005)). As
reporting involves some conscious decision, mistakes will be less frequent. On the other
hand, the reporting effort will add to the penalty for a negative report, and make it harder
for normal clients to establish a reputation as honest reporters. Alternative methods for
reducing the harm done by malicious clients (such as filtering mechanisms), as well as
tighter bounds on the percentage of false reports introduced by such clients, will be
addressed in future work.

7. Discussion and Future Work 

Further benefits can be obtained if the clients' reputation for reporting honestly is shared
within the market. The reports submitted by a client while interacting with other providers
will change the initial beliefs of a new provider. As we have seen in Section 5, providers
cheat less if they a priori expect with higher probability to encounter honest reporting
clients. A client that has once built a reputation for truthfully reporting the provider's
behavior will benefit from cooperative trade during her entire lifetime, without having to
convince each provider separately. Therefore the upper bound on the loss a client has to
withstand in order to convince a provider that she is a commitment type becomes an upper
bound on the total loss a client has to withstand during her entire lifetime in the market.
How to effectively share the reputation of clients within the market remains an open issue.

Related to this idea is the observation that clients that use our mechanism are
motivated to keep their identity. In generalized markets where agents are encouraged to
play both roles (e.g., a peer-to-peer file sharing market where the fact that an agent acts
only as "provider" can be interpreted as a strong indication of "double identity" with the
intention of cheating) our mechanism also solves the problem signaled by Friedman and
Resnick (2001) related to cheap online pseudonyms. The price to pay for the new identity
is the loss due to building a reputation as a truthful reporter when acting as a client.

Compared to incentive-compatible mechanisms that pay reporters depending on the feedback
provided by peers, the mechanism described here is less vulnerable to collusion. The only
reason individual clients would collude is to badmouth (i.e., artificially decrease the
reputation of) a provider. However, as long as the punishment for negative feedback is not
super-linear in the number of reports (this is usually the case), coordinating within a
coalition brings no benefits for the colluders: individual actions are just as effective as
actions taken as part of a coalition. Collusion between the provider and the client can only
accelerate the synchronization of strategies on one of the PPE profiles (collusion on a non-PPE
strategy profile is not stable), which is rather desirable. The only profitable collusion can
happen when competing providers incentivize normal clients to unfairly downrate their
current provider. Colluding clients become malicious in this case, and the limits on the
harm they can do are presented in Section 6.

The mechanism we describe here is not a general solution for all online markets. In
general retail e-commerce, clients don't usually interact with the same service provider more
than once. As we have shown throughout this paper, the assumption of a repeated interaction
is crucial for our results. Nevertheless, we believe there are several scenarios of practical
importance that do meet our requirements (e.g., interactions that are part of a supply chain).
For these, our mechanism can be used in conjunction with other reputation mechanisms to
guarantee reliable feedback and improve the overall efficiency of the market.

Our mechanism can be further criticized for being centralized. The reputation mechanism
acts as a central authority by supervising monetary transactions, collecting feedback
and imposing penalties on the participants. However, we see no problem in implementing
the reputation mechanism as a distributed system. Different providers can use different
reputation mechanisms, or can even switch mechanisms given that some safeguarding
measures are in place. Concrete implementations remain to be addressed by future work.

Although we present a setting where the service always costs the same amount, our
results can be easily extended to scenarios where the provider may deliver different kinds
of services, having different prices. As long as the provider believes that requests are
randomly drawn from some distribution, the bounds presented above can be computed
using the average values of $u$, $p$ and $c$. The constraint on the provider's belief is necessary
in order to exclude some unlikely situations where the provider cheats on a one-time,
high-value transaction, knowing that the following interactions carry little revenue and,
therefore, cannot impose effective punishments.

In this paper, we systematically overestimate the bounds on the worst case percentage
of false reports recorded by the mechanism. The computation of tight bounds requires a
precise quantitative description of the actual set of PPE payoffs the client and the provider
can have in $G^\infty$. Fudenberg et al. (1994) and Abreu, Pearce, and Stacchetti (1990) lay
the theoretical groundwork for computing the set of PPE payoffs in an infinitely repeated game
with discount factors strictly smaller than 1. However, efficient algorithms for finding
this set remain an open question. As research in this domain progresses, we expect to
be able to significantly lower the upper bounds described in Sections 4 and 5.

One direction of future research is to study the behavior of the above mechanism when
there is two-sided incomplete information: i.e., the client is also uncertain about the type
of the provider. A provider type of particular importance is the "greedy" type who always
likes to keep the client to a continuation payoff arbitrarily close to the minimal one. In this
situation we expect to be able to find an upper bound $k_C$ on the number of rounds in which a
rational client would be willing to test the true type of the provider. The condition $k_P < k_C$
describes the constraints on the parameters of the system for which the reputation effect
will work in favor of the client: i.e., the provider will be the first to give up the
"psychological" war and revert to a cooperative equilibrium.

The problem of involuntary reporting mistakes briefly mentioned in Section 6 needs
further addressing. Besides false negative mistakes (reporting 0 instead of 1), normal clients
can also make false positive mistakes (reporting 1 instead of the intended 0). In our present
framework, one such mistake is enough to ruin the reputation of a normal type client for
reporting honestly. This is one of the reasons why we chose a sequential model where the
feedback of the client is not required if the provider acknowledges low quality. Once the
reputation of the client becomes credible, the provider always rolls back the transactions that
generate (accidentally or not) low quality, so the client is not required to continuously defend
her reputation. Nevertheless, the consequences of reporting mistakes in the reputation
building phase must be considered in more detail. Similarly, mistakes made by the provider,
as well as monitoring and communication errors, will also influence the results presented here.

Last but not least, practical implementations of the mechanism we propose must
address the problem of persistent online identities. One possible attack created by easy
identity changes has been mentioned in Section 6: malicious buyers can continuously change
identity in order to discredit the provider. In another attack, the provider can use fake
identities to increase his revenue. When punishments for negative feedback are generated
endogenously by decreased prices in a fixed number of future transactions (e.g., Dellarocas,
2005), the provider can adopt the following strategy: he cheats on all real customers, but
generates a sufficient number of fake transactions in between two real transactions, such
that the effect created by the real negative report disappears. An easy fix to this latter
attack is to charge transaction or entrance fees. However, these measures also affect the
overall efficiency of the market, and therefore, different applications will most likely need
individual solutions.

8. Conclusions 

Effective reputation mechanisms must provide appropriate incentives in order to obtain
honest feedback from self-interested clients. For environments characterized by adverse
selection, direct payments can explicitly reward honest information by conditioning the
amount to be paid on the information reported by other peers. The same technique
unfortunately does not work when service providers have moral hazard, and can individually
decide which requests to satisfy. Sanctioning reputation mechanisms must therefore use
other mechanisms to obtain reliable feedback.

In this paper we describe an incentive-compatible reputation mechanism for settings where
the clients also have a repeated presence in the market. Before asking the clients for feedback,
we allow the provider to acknowledge failures and reimburse the price paid for service. When
future transactions generate sufficient profit, we prove that there is an equilibrium where
the provider behaves as socially desired: he always exerts effort, and reimburses clients that
occasionally receive bad service due to uncontrollable factors. Moreover, we analyze the
set of pareto-optimal equilibria of the mechanism, and establish a limit on the maximum
amount of false information recorded by the mechanism. The bound depends both on the
external alternatives available to clients and on the ease with which they can commit to
reporting the truth.

Appendix A. Proof of Proposition 2 

The probability that the client reports negative feedback on the equilibrium path of any 
pareto-optimal PPE strategy is zero. 
Proof. 

Step 1. Following the principle of dynamic programming (Abreu et al., 1990), the payoff
profile $V = (V_C, V_P)$ is a PPE payoff profile of $G^\infty$ if and only if there is a strategy
profile $\sigma$ in $G$, and continuation PPE payoff profiles $\{W(y) \mid y \in Y\}$ of $G^\infty$, such that:

• $V$ is obtained by playing $\sigma$ in the current round, and a PPE strategy that gives $W(y)$
as a continuation payoff, where $y$ is the public outcome of the current round, and
$\Pr[y \mid \sigma]$ is the probability of observing $y$ after playing $\sigma$:

$$V_C = (1 - \delta) g_C(\sigma) + \delta \Big( \sum_{y \in Y} \Pr[y \mid \sigma] \cdot W_C(y) \Big);$$

$$V_P = (1 - \delta) g_P(\sigma) + \delta \Big( \sum_{y \in Y} \Pr[y \mid \sigma] \cdot W_P(y) \Big);$$

• no player finds it profitable to deviate from $\sigma$:

$$V_C \geq (1 - \delta) g_C\big((\sigma'_C, \sigma_P)\big) + \delta \Big( \sum_{y \in Y} \Pr[y \mid (\sigma'_C, \sigma_P)] \cdot W_C(y) \Big); \quad \forall \sigma'_C \neq \sigma_C$$

$$V_P \geq (1 - \delta) g_P\big((\sigma_C, \sigma'_P)\big) + \delta \Big( \sum_{y \in Y} \Pr[y \mid (\sigma_C, \sigma'_P)] \cdot W_P(y) \Big); \quad \forall \sigma'_P \neq \sigma_P$$

The strategy $\sigma$ and the payoff profiles $\{W(y) \mid y \in Y\}$ are said to enforce $V$.
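The enforcement conditions of Step 1 can be checked mechanically for one player once the stage payoffs, the outcome distributions and the continuation payoffs are given. The sketch below is a toy illustration under assumptions of our own: the two public outcomes and all numbers are hypothetical and unrelated to the game $G$, and the check does not verify that the continuation payoffs are themselves PPE payoffs.

```python
def value(stage_payoff: float, dist: dict, cont: dict, delta: float) -> float:
    """(1 - delta) * g(sigma) + delta * sum_y Pr[y | sigma] * W(y)."""
    return (1.0 - delta) * stage_payoff + delta * sum(
        prob * cont[y] for y, prob in dist.items()
    )


def enforces(g_eq, dist_eq, deviations, cont, delta, tol=1e-12) -> bool:
    """Check one player's incentive constraints.

    g_eq, dist_eq : stage payoff and outcome distribution on the prescribed profile
    deviations    : iterable of (stage payoff, outcome distribution) pairs, one
                    for each unilateral deviation of this player
    cont          : continuation payoff W(y) of this player for each outcome y
    """
    v = value(g_eq, dist_eq, cont, delta)
    return all(
        v + tol >= value(g_dev, dist_dev, cont, delta)
        for g_dev, dist_dev in deviations
    )


# Toy example: cooperating pays 1 now and makes the outcome "good" likely;
# shirking pays 2 now but makes "bad" likely. Only "good" is rewarded later.
coop = (1.0, {"good": 0.9, "bad": 0.1})
shirk = (2.0, {"good": 0.2, "bad": 0.8})
continuation = {"good": 1.0, "bad": 0.0}

if __name__ == "__main__":
    for delta in (0.9, 0.1):
        print(delta, enforces(coop[0], coop[1], [shirk], continuation, delta))
```

A complete check would run `enforces` once for the client and once for the provider, each with their own stage payoffs and continuation payoffs.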

Step 2. Take the PPE payoff profile $V = (V_C, V_P)$, such that there is no other PPE
payoff profile $V' = (V'_C, V'_P)$ with $V_C < V'_C$. Let $\sigma$ and $\{W(y) \mid y \in Y\}$ enforce $V$, and assume
that $\sigma$ assigns positive probability $\beta_0 = \Pr[q_0 0 \mid \sigma] > 0$ to the outcome $q_0 0$. Let
$\beta_1 = \Pr[q_0 1 \mid \sigma]$ (possibly equal to 0), and consider:

• the strategy profile $\sigma' = (\sigma'_C, \sigma_P)$ where $\sigma'_C$ is obtained from $\sigma_C$ by asking the client
to report 1 instead of 0 when she receives low quality (i.e., $q_0$);

• the continuation payoffs $\{W'(y) \mid y \in Y\}$ such that $W'_i(q_0 1) = \beta_0 W_i(q_0 0) + \beta_1 W_i(q_0 1)$
and $W'_i(y \neq q_0 1) = W_i(y)$ for $i \in \{C, P\}$. Since the set of correlated PPE payoff
profiles of $G^\infty$ is convex, if $W(y)$ are PPE payoff profiles, so are $W'(y)$.

The payoff profile $(V'_C, V_P)$, $V'_C = V_C + (1 - \delta)\beta_0 \varepsilon$, is a PPE payoff profile because
it can be enforced by $\sigma'$ and $\{W'(y) \mid y \in Y\}$. However, this contradicts our assumption that
no PPE payoff profile $V'$ has $V_C < V'_C$, so $\Pr[q_0 0 \mid \sigma]$ must be 0. Following exactly the same
argument, we can prove that $\Pr[q_1 0 \mid \sigma] = 0$.

Step 3. Taking $V$, $\sigma$ and $\{W(y) \mid y \in Y\}$ from Step 2, we have:

$$V_C = (1 - \delta) g_C(\sigma) + \delta \Big( \sum_{y \in Y} \Pr[y \mid \sigma] \cdot W_C(y) \Big); \qquad (10)$$

If no other PPE payoff profile $V' = (V'_C, V'_P)$ can have $V'_C > V_C$, it must be that the
continuation payoffs $W(y)$ satisfy the same property. (Assume otherwise that there is a
PPE payoff profile $(W'_C(y), W'_P(y))$ with $W'_C(y) > W_C(y)$. Replacing $W'_C(y)$ in (10) we obtain
a $V'$ that contradicts the hypothesis.)

By continuing the recursion, we obtain that the client never reports 0 on the equilibrium
path that enforces a payoff profile as defined in Step 2. Pareto-optimal payoff profiles clearly
enter this category, hence the result of the proposition. □



Appendix B. Proof of Proposition 3 

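The upper bound on the percentage of false reports recorded by the reputation mechanism
in any PPE equilibrium is:

$$\gamma \leq \begin{cases} \dfrac{(1 - \alpha)(p - u) + p\rho}{p} & \text{if } p\rho \leq u(1 - \alpha); \\[2ex] \dfrac{p\rho}{u} & \text{if } p\rho > u(1 - \alpha) \end{cases}$$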
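For a quick numerical reading of this bound, the sketch below evaluates the two cases, assuming they read $((1 - \alpha)(p - u) + p\rho)/p$ when $p\rho \leq u(1 - \alpha)$ and $p\rho/u$ otherwise. The parameter values are hypothetical, and clipping the result to $[0, 1]$ is an addition made here so that the output can be read as a fraction.

```python
def false_report_bound(u: float, p: float, rho: float, alpha: float) -> float:
    """Two-case upper bound on the fraction of false reports.

    Assumed form (a reading of the bound stated above):
      ((1 - alpha) * (p - u) + p * rho) / p   if p * rho <= u * (1 - alpha)
      p * rho / u                             otherwise
    The result is clipped to [0, 1]; the clipping is not part of the bound.
    """
    if p * rho <= u * (1.0 - alpha):
        bound = ((1.0 - alpha) * (p - u) + p * rho) / p
    else:
        bound = p * rho / u
    return min(1.0, max(0.0, bound))


if __name__ == "__main__":
    # Hypothetical values: utility u, price p, outside-option premium rho,
    # and probability alpha of high quality when the provider exerts effort.
    print(false_report_bound(u=12.0, p=10.0, rho=0.1, alpha=0.9))
    print(false_report_bound(u=12.0, p=10.0, rho=0.2, alpha=0.9))
```

Under this reading the two branches meet at $1 - \alpha$ when $p\rho = u(1 - \alpha)$, so the bound is continuous in $\rho$.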

Proof. Since clients never report negative feedback along pareto-optimal equilibria,
the only false reports recorded by the reputation mechanism appear when the provider
delivers low quality, and the client reports positive feedback. Let $\sigma = (\sigma_C, \sigma_P)$ be a
pareto-optimal PPE strategy profile. $\sigma$ induces a probability distribution over public histories
and, therefore, over expected outcomes in each of the following transactions. Let $\mu_t$ be the
probability distribution induced by $\sigma$ over the outcomes in round $t$. $\mu_t(q_0 0) = \mu_t(q_1 0) = 0$,
as proven by Proposition 2. The payoff received by the client when playing $\sigma$ is therefore:

$$V_C(\sigma) = (1 - \delta) \sum_{t=0}^{\infty} \delta^t \Big( \mu_t(q_0 1)(-p) + \mu_t(q_1 1)(u - p) + \mu_t(l) \cdot 0 + \mu_t(out)(u - p - p\rho) \Big);$$

where $\mu_t(q_0 1) + \mu_t(q_1 1) + \mu_t(l) + \mu_t(out) = 1$ and $\mu_t(q_0 1) + \mu_t(l) \geq (1 - \alpha)\mu_t(q_1 1)/\alpha$, because
the probability of $q_0$ is at least $(1 - \alpha)/\alpha$ times the probability of $q_1$.
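The client's payoff and the constraint on the outcome probabilities can be evaluated numerically for a given sequence of per-round distributions. The sketch below assumes the normalized discounted-sum form $(1 - \delta)\sum_t \delta^t(\cdot)$ and truncates the horizon to the length of the supplied list; the outcome labels and all numbers are hypothetical.

```python
def client_payoff(mu_seq, u, p, rho, delta):
    """Normalized discounted payoff of the client over a finite list of rounds.

    mu_seq[t] maps the outcome labels "q0_1", "q1_1", "l" and "out" to their
    probabilities in round t (labels chosen here for illustration).
    """
    per_outcome = {"q0_1": -p, "q1_1": u - p, "l": 0.0, "out": u - p - p * rho}
    total = 0.0
    for t, mu in enumerate(mu_seq):
        if abs(sum(mu.values()) - 1.0) > 1e-9:
            raise ValueError(f"probabilities in round {t} do not sum to 1")
        total += delta ** t * sum(mu[k] * payoff for k, payoff in per_outcome.items())
    return (1.0 - delta) * total


def quality_constraint_ok(mu, alpha, tol=1e-12):
    """Check mu(q0_1) + mu(l) >= (1 - alpha) * mu(q1_1) / alpha."""
    return mu["q0_1"] + mu["l"] + tol >= (1.0 - alpha) * mu["q1_1"] / alpha


if __name__ == "__main__":
    u, p, rho, delta, alpha = 12.0, 10.0, 0.1, 0.95, 0.9  # hypothetical values
    cooperative = {"q0_1": 0.0, "q1_1": 0.9, "l": 0.1, "out": 0.0}
    always_out = {"q0_1": 0.0, "q1_1": 0.0, "l": 0.0, "out": 1.0}
    print(quality_constraint_ok(cooperative, alpha))
    print(client_payoff([cooperative] * 2000, u, p, rho, delta))
    print(client_payoff([always_out] * 2000, u, p, rho, delta))
```

With a stationary distribution and a long horizon, the normalized sum is close to the per-round expected payoff, which makes it easy to compare a candidate profile against the value of always taking the outside option.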

When the discount factor, $\delta$, is the probability that the repeated interaction will continue
after each transaction, the expected probability of the outcome $q_0 1$ is:

$$\gamma = (1 - \delta) \sum_{t=0}^{\infty} \delta^t \mu_t(q_0 1);$$

Since any PPE profile must give the client at least $\underline{V}_C = u - p(1 + \rho)$ (otherwise the client
is better off by resorting to the outside option), $V_C(\sigma) \geq \underline{V}_C$. By replacing the expression
of $V_C(\sigma)$, and taking into account the constraints on the probability of $q_1$ we obtain: